The propensity to sign-track is associated with externalizing behavior and distinct patterns of reward-related brain activation in youth

Externalizing behaviors in childhood often predict impulse control disorders in adulthood; however, the underlying bio-behavioral risk factors are incompletely understood. In animals, the propensity to sign-track, or the degree to which incentive motivational value is attributed to reward cues, is associated with externalizing-type behaviors and deficits in executive control. Using a Pavlovian conditioned approach paradigm, we quantified sign-tracking in 40 healthy 9–12-year-olds. We also measured parent-reported externalizing behaviors and anticipatory neural activations to outcome-predicting cues using the monetary incentive delay fMRI task. Sign-tracking was associated with attentional and inhibitory control deficits and the degree of amygdala, but not cortical, activation during reward anticipation. These findings support the hypothesis that youth with a propensity to sign-track are prone to externalizing tendencies, with an over-reliance on subcortical cue-reactive brain systems. This research highlights sign-tracking as a promising experimental approach delineating the behavioral and neural circuitry of individuals at risk for externalizing disorders.


Results
Pavlovian conditioned approach. Our primary goal was to establish a Pavlovian conditioning paradigm comparable to those used in animal models to identify sign-tracking and goal-tracking phenotypes in youth. We used the Pavlovian conditioning paradigm described by Flagel et al. 32 for rodents and adapted for humans by Joyner et al. 30 as a basis for the current study (Fig. 1a). An a priori power analysis based on the index of Pavlovian conditioned behavior found in an existing study of sign-tracking and goal-tracking in humans 27 indicated that a sample of 40 would be sufficient to detect a large effect (d = 0.91) with 80% power (see Methods for the full a priori power analysis report). Participants (N = 40, Table 1) completed 40 response-independent conditioning trials consisting of lever (CS) presentation and subsequent reward (US; $0.20 token; all monetary units in USD) presentation. A randomly selected inter-trial interval (ITI) followed each CS-US trial.
Feasibility was assessed by identifying both qualitative ( Supplementary Fig. S1) and quantitative markers of child engagement and learning. We identified individual differences in the attribution of incentive salience to reward-paired cues (sign-tracking or goal-tracking behaviors) via a Pavlovian Conditioned Approach (PavCA) index based on previously used models in animal studies 33 derived from the number and timing of physical contacts to the CS and US during CS-presentation and ITI phases.
Consistent with animal models 33 , we classified categorical phenotypic groups using a PavCA value during the CS-period of 0.5 or greater (STs) and less than −0.5 (GTs). PavCA scores during the CS-period ranged from −0.18 to 0.95 (m = 0.41, sd = 0.40; Supplementary Table S1) and the distribution indicated that participant behaviors were skewed toward either neutral or lever-directed behaviors (Fig. 1b), therefore we were able to identify STs, but not GTs. Nineteen participants were identified as STs (m PavCA = 0.79, sd = 0.13), and since no participants had a PavCA value less than −0.5, we classified the remaining 21 participants as non-sign-trackers (non-ST; m PavCA = 0.06, sd = 0.14) and used these as our comparison groups (see Supplemental Materials  To assess learning of conditioned responses to the CS, we examined behavioral responses to the CS using twosided linear mixed effects models with the factors time (Block 1-4), phase (CS, ITI), and phenotype (ST, non-ST) for lever-and reward-directed behaviors ( Supplementary Fig. S2). Because CS lever contacts correlated positively with ITI lever contacts (r = 0. 59 www.nature.com/scientificreports/ for lever-and reward-directed behaviors (contacts and probability) in order to adequately compare between CS and ITI periods. Briefly, we observed that STs demonstrated a higher probability to contact the lever during the CS period and over time (three-way interaction, F 3,266 = 3.84, p = 0.010, η 2 p = 0.04, 90% CI = 0.00 to −0.08). STs and non-STs differed in their probability to contact the reward between phases (phase main effect, F 1,266 = 504.79, p < 0.001, η 2 p = 0.65, 90% CI = 0.60 to 0.70) however there were no significant phenotypic differences during either phase or over time (three-way interaction, F 3,266 = 2.07, p = 0.104, η 2 p = 0.02, 90% CI = 0.00 to 0.05). Of note, there were significant main effects for age (F 1,36 = 6.62, p = 0.010, η 2 p = 0.16, 90% CI = 0.02 to 0.34) and sex (females higher; F 1,36 = 7.43, p = 0.009, η 2 p = 0.17, 90% CI = 0.03 to 0.35) in the probability to contact the reward which may implicate developmental differences impacting goal-tracking behaviors. Together, it appears that much of the individual variation in behavior stemmed from lever-directed behaviors during CS presentation which supports the general skew towards sign-tracking behavior.
We also examined PavCA behavior between phases for each phenotype over time using non-normalized metrics (Fig. 1c). PavCA scores showed significant main effects for phenotype (F 1,36 = 52.82, p < 0.001, η 2 p = 0.59, 90% CI = 0.42 to 0.71) and phase (F 1,265.1 = 1274.34, p < 0.001, η 2 p = 0.83, 90% CI = 0.80 to 0.85), suggesting that participants are behaviorally discriminating between phases, and STs are doing so to a greater degree. STs increasingly approached the lever over time (higher PavCA score; phenotype by time interaction, F 3,265.1 = 2.63, p = 0.050, η 2 p = 0.03, 90% CI = 0.00 to 0.06) and more so during CS presentation (phenotype by phase interaction, F 1,265.1 = 131.46, p < 0.001; three-way interaction, F 1,265.1 = 4.21, p = 0.006, η 2 p = 0.05, 90% CI = 0.01 to 0.09). This appears to reflect learning of the conditioned response and the characterization of sign-tracking by selectively increasing lever approach during the CS period. Conversely, non-STs progressively decreased their lever-directed behaviors over the course of training, further characterizing the distinction between phenotypic responses. PavCA showed a significant main effect for sex (males higher; F 1,36.1 = 13.28, p = 0.001, η 2 p = 0.27, 90% CI = 0.09 to 0.45) but not age (F 1,36 = 1.64, p = 0.208, η 2 p = 0.04, 90% CI = 0.00 to 0.19). Together, these results show that the two phenotypes selectively discriminate between phases and, during CS, PavCA behaviors continually diverge over the course of the session.  www.nature.com/scientificreports/ Externalizing behaviors by phenotype. The data presented so far provide evidence for a bias toward responding to reward-paired cues, indicative of sign-tracking, in a subset of youth. Given the well-established characteristic differences between STs and GTs in animal studies 14,21 , we aimed to validate the application of phenotypic distinctions in human youth. Our primary hypothesis was that human STs would demonstrate symptomatic and neurobiological profiles consistent with externalizing characteristics and a reliance on bottom-up processing rather than top-down cognitive control. We examined this using developmentally validated parentreport questionnaires (Child Behavior Checklist, CBCL 34 and Early Adolescent Temperament Questionnaire, EATQ 35 ). It is important to note that these measures do not represent clinical diagnoses, but dimensions of behavior associated with psychiatric symptoms. Measures were tested for normality and all CBCL subscales were log transformed. To directly compare measures between STs and non-STs, we used two-sided Welch two-sample t-tests (Supplementary Table S3; Fig. 2) with FDR p-value corrections used for multiple comparisons. Given that attentional control deficits are a hallmark characteristic in rodent STs 19,21 , and given the shared circuitry between sign-tracking rats 21 and humans with attention-based and impulse control disorders 36 , we expected elevated attention problems in the sign-trackers in our sample. Indeed, when compared to non-STs, STs showed increased symptoms of attention deficit/hyperactivity problems (CBCL; t 30.9 = −2.66, p = 0.014, d = 0.90, s.e. = 0.31, 95% CI = −1.45 to −0.19; Fig. 2a) which supports prior work highlighting differences in reward processing in attention disorders 37 .
Animal literature has further identified characteristic differences in fear responses, such that, when exposed to fear conditioning tasks, STs show exaggerated fear-associated cue reactivity consistent with an increased susceptibility to both substance use and post-traumatic stress 38,39 . When compared to non-STs, STs in our sample had increased reported symptoms of fear (EATQ; unpleasant affect related to anticipation of distress; t 35 = −2.79, p = 0.012, d = 0.90, s.e. = 0.18, 95% CI = −0.88 to −0.14; Fig. 2d), which is in line with previous animal findings implicating STs as having an increased vulnerability to fear-related responses to cues, regardless of context 39 .
Furthermore, inhibitory control deficits have been associated with sign-tracking in rats 18,20,40 as well as an increased vulnerability for impulse control disorders and addiction in humans 41 16,18,48 , this relationship should be further examined within a larger sample.
Beyond the translational utility of capturing sign-tracking tendencies in youth, we aimed to examine symptomatic differences between phenotypes consistent with human-specific psychopathology. Specifically, we expected to see behavioral tendencies in STs that are developmentally characteristic of externalizing disorders. In contrast to non-STs, caregivers reported STs as having increased oppositional defiant problems (CBCL; t 31 = −3.44, p = 0.006, d = 1.16, s.e. = 0.29, 95% CI = −1.58 to −0.40; Fig. 2b) and social problems (CBCL; including items relating to jealousy, not getting along with others, and not being liked by others; t 30.2 = −3.10, p = 0.008, d = 1.05, s.e. = 0.30, 95% CI = −1.56 to −0.32; Fig. 2c). Although these two subscales are positively correlated (r = 0.64, p < 0.001; Supplementary Table S4) suggesting homogenous traits, they are derived from independent items on the CBCL. STs also showed increased levels of negative affect (EATQ composite score of frustration, depressive mood, and aggressive behaviors; t 27.3 = −3.14, p = 0.006, d = 1.14, s.e. = 0.16, 95% CI = −1.89 to −0.22; Fig. 2f). Taken together, the association between the degree of sign-tracking and these measures support the notion that STs exhibit deficits in behavioral regulation and inhibitory control as well as a bias toward affective/reactive responding across multiple diagnostic criteria 42 . Neuroimaging. The results presented above demonstrate behaviors and tendencies specific to STs that may broadly indicate a reliance on bottom-up processing and are consistent with characteristics of both rodent models and theoretical accounts of translation of these paradigms to clinical populations 43 . To further elucidate the neurobiological processes that may contribute to the propensity to sign-track, we employed functional neuroimaging to measure reward processing. Participants completed the Monetary Incentive Delay (MID) fMRI task 44 to determine the brain response to a gain, no gain, or loss predictive cue. We performed a whole-brain voxelwise linear mixed effects analysis with fixed effects for group, condition, age, and sex, a group by condition interaction, and a random intercept for subject. We followed this with planned contrasts investigating a group (ST, non-ST) by condition (win vs neutral or loss vs neutral) interaction. Estimated marginal means were used for post hoc tests with Tukey p-value adjustment for multiple comparisons. A total of 29 subjects (n ST = 12, n non-ST = 17) survived motion correction.
When examining the win-neutral contrasts during the MID anticipatory phase, BOLD activations in the left inferior parietal lobe (IPL) showed a significant group by condition interaction (F 1,27 = 10.30, p = 0.003, η 2 p = 0.28, 90% CI = 0.07 to 0.48, Fig. 3a; see Supplementary Table S5 for details on coordinates and volumes for these and additional significant regions). Post hoc tests using average contrasts extracted from significant voxels in each ROI indicate that non-STs significantly increased activation from neutral to large win conditions (t 27 = −4.15, p = 0.002, d = −1.42, s.e. = 0.37, 95% CI = −2.19 to −0.66), indicating salience-dependent modulation within the IPL. Conversely, there was no evidence for a similar effect in STs and activations did not differ between conditions or from non-STs during the win condition (ps > 0.8). Importantly, this same modulatory pattern is evident in the left IPL for the loss-neutral contrast during the anticipatory phase (group by condition interaction, F 1,27 = 12.31, p = 0.002, η 2 p = 0.31, 90% CI = 0.09 to 0.51; Fig. 3b), which appears to indicate that this effect is not reward-specific but rather salience-driven. This same pattern is consistent across multiple cortical regions implicated in cognitive   Table S5) which may indicate that, in comparison to non-STs, STs do not actively engage cognitive control and the salience of the cue is cognitively irrelevant. During the win-neutral anticipation contrasts, BOLD activations in the right amygdala showed a significant group by condition interaction (F 1,27 = 9.04, p = 0.006, η 2 p = 0.25, 90% CI = 0.05 to 0.46; Fig. 3c), such that STs exhibited a significant increase in amygdala activation from neutral to large win conditions (t 27 = −3.05, p = 0.025, Cohen's d = −1.25, s.e. = 0.43, 95% CI = −2.12 to −0.37). These findings indicate salience-dependent modulation within the amygdala only for STs. In contrast to the IPL, amygdala activation for non-STs did not differ between conditions or from STs during either condition (ps > 0.1; see also Supplementary Fig. S5 for additional sensitivity analysis after removing a possible outlier in the ST group). Taken together, the associations between cortical activation during salience processing for non-STs and subcortical activation during salience processing for STs, support two independent processes that contribute to the emergence of sign-tracking and non-sign-tracking behaviors. Whereas non-ST individuals showed greater activation in brain areas that have been associated with executive control, ST youth showed greater activation in the salience network.
To further understand the possible implications of these differential neural modulatory patterns, we used Pearson's r to examine how percent signal change correspond to both externalizing behaviors and Pavlovian conditioned approach indices ( Fig. 4; Supplementary Table S6). Left IPL activation during win-neutral contrasts was negatively related to oppositional defiant problems (r = −0.43, p = 0.020), marginally negatively with negative affect (r = −0.34, p = 0.070), and marginally positively with inhibitory control (r = 0.36, p = 0.060), whereas left IPL during loss-neutral contrasts was negatively related to risk-taking behaviors (r = −0.45, p = 0.020). Although not all statistically significant, these associations support the notion that brain-behavior associations might differ by phenotype. We further examined the correlations between percent signal change and PavCA scores. Activations in both regions during their respective contrasts were significantly correlated with the propensity to sign-track such that higher PavCA scores (sign-tracking) were related to less activity in the left IPL during both win-neutral (r = −0.46, p = 0.010) and loss-neutral (r = −0.45, p = 0.010) as well as greater activity in the amygdala during win-neutral (r = 0.48, p = 0.010). Note that since the ROIs were selected based on differential activation  www.nature.com/scientificreports/ between STs and non-STs, these post-hoc correlations with clinical measures are inflated and would be smaller in an independent sample. These findings are in agreement with the pre-clinical literature, suggesting that the behavioral endophenotype of sign-trackers is dominated by subcortical motivational systems; whereas that of non-sign-trackers is dominated by cortical processes.
Environmental influences. Given that the stress response system directly influences both the dopaminergic reward system 47 and animal sign-and goal-tracking behaviors 7,48,49 , we also gathered information regarding environmental markers of adversity and protective factors including relationships and resources available to these youth 50 Table S3). Together, these findings support prior research addressing early life adversity as a potential influential driver for the divergence in phenotypic differences and associated vulnerabilities. While we did not have adequate power to do so, future studies with larger sample sizes would also benefit from examining the question whether social determinants of mental health are an important mediator or moderator for the expression of sign-or goal-tracking tendencies. Oppositional defiance = DSM-oriented subscale from the Child Behavior Checklist (parent report). Scores were log transformed due to non-normality. Externalizing problems = composite Child Behavior Checklist subscale (parent report); Inhibitory control = subscale from the Early Adolescent Temperament Questionnaire (parent report). Surgency = subscale from the Early Adolescent Temperament Questionnaire (parent report); Negative Affect = composite subscale including depressive mood, aggressive behaviors, and frustration on the Early Adolescent Temperament Questionnaire (parent report); Aggression = subscale from the Early Adolescent Temperament Questionnaire (parent report); Sensation seeking = subscale of the Urgency, Premeditation, Perseverance, Sensation Seeking, Positive Urgency scale (youth self-report

Discussion
This study translated a paradigm previously developed to examine sign-and goal-tracking behaviors in rodents in order to determine whether these behavioral distinctions (1) could be observed in human youth, and (2) are associated with externalizing behaviors or reward-related neural activation patterns. First, we present evidence for sign-tracking behaviors in pre-adolescents. Second, STs were distinctive in both externalizing traits and neurobiological patterns consistent with impulse control disorders. Specifically, STs showed greater externalizing characteristics and salience-dependent subcortical reactivity to reward cues, whereas non-STs had fewer externalizing characteristics and more actively engaged cortical control regions. Third, both externalizing traits and the degree of sign-tracking behavior correlated with neural modulation patterns such that those with higher BOLD activation in the left IPL also reportedly had fewer externalizing characteristics and had lower PavCA scores (non-STs) whereas those with increased BOLD activation in the amygdala also had higher PavCA scores (STs). Finally, those displaying a higher degree of sign-tracking behavior also reported increased potential for early life stress (lower income, increased basic needs unaffordability, and fewer protective factors). Taken together, the phenotypic differences in our sample are consistent with rodent models of sign-tracking and support prior findings from humans with impulse control disorders 5 .
The differences in parent-reported behavioral tendencies between STs and non-STs in our sample reflect characteristic patterns in animal studies that are mechanistically tied to dual-systems processing 5 . A fundamental characteristic of an incentive stimulus is its ability to bias attention and elicit approach behaviors suggestive of distinct cognitive/attentional control tendencies specific to this sign-trackers. In humans, attentional capture to reward-paired stimuli (resonant of sign-tracking behaviors) appears to be stronger for those with low cognitive control 51 and directly linked to both impulsivity 52 and substance use 51 . Thus, the presence of increased attentional and executive control problems in STs, in conjunction with cortical activation patterns, suggests that, the attribution of incentive salience to reward-paired cues is driven by inefficient top-down cognitive and executive control 19 . The other symptoms reported here for STs reflect patterns of behavior consistent with the profile of dopamine-driven cue-reactivity in externalizing disorders, namely, defiance, aggression, and impulsivity. Notably, lower activation in cortical regions (i.e., in STs) is also correlated with behavioral reports of increased risk taking, defiance, and to a lesser extent externalizing behaviors, aggression, and lower inhibitory control. Finally, STs were reported to have increased fear-related behaviors, a finding that is supported by rodent models 38,39 . This supports our finding that non-STs are not only reported to have fewer fear-related behaviors but also a neural modulatory profile indicative of emotion regulation and cognitive control.
Animal studies have shown a consistent double dissociation between top-down dominant cortical systems in GTs and subcortical cue-driven systems that dominate in STs 14 . Here, we found concurring evidence in humans that, in contrast to STs, non-STs cognitively modulate their anticipatory responses in the IPL (and consistently across multiple cognitive control regions) but not the amygdala according to cue salience. The IPL has been implicated in probability and reward-related decision making 53 and this pattern for non-STs may reflect cognitive discrimination in anticipatory and preparatory responses according to the salience of the cues. Whereas for STs, the non-modulation in the IPL appears to indicate indiscriminate cortical activation according to cue salience, suggesting that, cognitively, STs interpret all cues as salient and respond relatively equally across conditions. Thus, the selective modulation in this region by non-STs demonstrates distinct cognitive assessment of the cue and preparatory responses selectively applied to highly salient cues. STs do, however, affectively modulate subcortical (amygdala) activity in preparation for reward-paired cues. The amygdala contributes to contextual appraisal of rewards and motivational significance of incentives and is integral to neural processing of reward and reward cues 15 . Therefore, the apparent reliance for STs on primarily subcortical structures to modulate responses and the fact that this salience-dependent modulation does not translate to cognitive action, supports a theory of increased vulnerability to impulse control disorders based on both individual differences in incentive salience attribution and mechanistic differences in reward processing. Our findings are also consistent with a recent fMRI analysis of model-based vs model-free learning in humans demonstrating that increased incentive salience attribution was associated with neural profiles of striatal reward prediction error signals in STs; whereas stronger cortical signals of state-prediction errors were evident in GTs 28 . Together, STs in our sample are categorized both by behavioral Pavlovian responses to reward-paired cues and selectively modulated amygdala activity which points to a likelihood that these patterns are dopamine-dependent and supports previous pre-clinical animal reports of a dual-systems approach.
The dynamic nature of cortical development, top-down control, and neural organization in pre-adolescence 31 highlights the importance of considering developmental trajectories in light of both sign-tracking and externalizing tendencies. Specifically, adolescence is characterized by underdeveloped cortical downregulation of the amygdala that contributes to attentional/executive control deficits 31 . It is likely, therefore, that this developmental variability is, at least in part, impacting both the neuromodulatory patterns and the significant skew in the tendency to sign-track. For instance, the limited goal-tracking behaviors is a notable deviance from rodent models. While age did not significantly differ between phenotypes, given our sample constraints, examination of these traits in larger samples with varying age ranges would help to further elucidate whether this is an effect of developmental stage or another factor not accounted for in our translation of this paradigm. However, in the absence of a significant age effect in PavCA behaviors in our sample, as well as statistical controls for age in place in neuroimaging models, it remains striking that we measured marked individual differences in both brain and behavior. Further, the dynamic nature of both neural and behavioral traits during this age range, underscores the importance of measuring potential predictors of risk for psychopathology, as these characteristics may still be malleable. While more research is necessary to further clarify the developmental trajectory of sign-and goal-tracking and its impact on future psychopathology, measuring and characterizing these phenotypes early www.nature.com/scientificreports/ and longitudinally as they evolve throughout development, underscores the potential utility of this paradigm to delineate possible preventative and interventive measures. The individual differences in both the degree of sign-tracking and the development of impulse control deficits suggests the possibility of additional risk factors at play. Both sign-tracking 49 and impulse control disorders 47 are influenced by stressful early environments. Our finding that STs have lower income households, increased difficulty for affording basic needs, and less social support suggests probable environmental impacts on individual differences in sign-tracking tendencies and the underlying neurobiological mechanisms. The sensitivity of amygdala-frontal development in adolescence to stressful environmental input can bias the reward system to be more reactive to cues 54 and vulnerable to reward-seeking and substance use 55 . The finding that indicators of stressful environments are present more prominently in STs is consistent with this prior work and is supported by the corresponding psychopathological and neural profiles we observe in STs. However, this finding will need to be examined more fully in larger, more demographically diverse samples.
As a whole, these results provide evidence for sign-tracking in youth and a general neuromodulatory and behavioral profile consistent with externalizing traits; however, a number of limitations should be noted. First, our sample size was relatively small and demographically homogenous which limited our variance and power to perform multivariate analyses or complex neuroimaging models. With 40 participants, we were only powered to detect effects larger than roughly 75% of those seen in larger social psychology studies 56 . Additionally, while similarly sized neuroimaging studies have been the norm, reproducibility has been poor and there is even evidence that thousands of participants are necessary to conduct reproducible brain-wide association studies 57 . These two facts lead to relatively high risk of both type I and type II errors, a caveat that could be included in most similarly sized studies. In particular, future, higher powered studies would benefit from statistically addressing environmental factors including income and family history of psychopathology and substance use as well as individuallevel factors such as pubertal stage to further account for possible sex differences. Additionally, a larger cohort would provide the opportunity to address the more limited extremes of sign-and goal-tracking behaviors and further validate if these neural and behavioral distinctions reflect differences in risk for externalizing disorders. Further, our aim was to measure the potential of sign-tracking behaviors as an early biomarker and/or behavioral marker for the risk of later development of psychopathology in a healthy sample. We therefore cannot directly address the application of sign-and goal-tracking to clinical diagnoses or the potential range of sign-and goaltracking behaviors within a clinical sample from this study. Future research is needed to further investigate the direct relevance of sign-and goal-tracking to clinical symptoms. Also of note is our use of monetary rewards for both the PavCA paradigm and the MID task which could influence salience for those participants coming from lower income households. Finally, additional methods of measurement should be considered (e.g., eye tracking, approach behaviors) that could further elucidate whether goal-tracking is, in fact, limited in pre-adolescence, or simply not adequately measured in this sample.
The current data strongly support the feasibility and utility of the sign-tracker/goal-tracker model of reward processing as a useful experimental approach and construct to measure individual differences in the propensity to attribute incentive salience to reward cues in youth. Moreover, these findings reveal a consistent and substantial pattern of neuromodulatory and externalizing trait responses that reliably dissociate bottom-up processing in STs from top-down cognitive control in non-STs. These results provide evidence in human youth for an underlying dual-systems mechanism of neural reward processing between cortical and subcortical control and link the degree of sign-tracking to externalizing aspects of psychopathology that may predispose some individuals to the development of impulse control disorders. The feasibility of this paradigm and the initial delineation of the circuitry associated with the degree of sign-tracking is enhanced by the extensive knowledge base obtained from prior animal studies, which provide neuroscience-based rationale for future studies elucidating the underlying mechanisms of risk for externalizing psychopathology. While there is still more to be discovered, particularly regarding the developmental trajectory, stability, or malleability of these phenotypes in humans, these data offer promise in identifying sign-and goal-tracking phenotypes in humans and present initial evidence detailing corresponding neural profiles and behavioral traits consistent with indicators of impulse control disorders, helping to pave the way for future research with clinical applications.

Materials and methods
Participants and sample selection. The sample consisted of 9-12-year-old youth (N = 40, m = 9.6 years, sd = 0.93; Table 1). All tasks and measures were used in the whole sample. Participants were excluded if they received a diagnosis of a severe learning disorder, Axis 1 psychiatric disorder, Attention Deficit/Hyperactivity Disorder or Pervasive Developmental Disorder (e.g., Autism Spectrum Disorder) as these conditions may affect attentional control and possibly bias the measurement of attention to each stimulus, or if they endorsed MRI contraindications including non-correctable vision, hearing, or sensorimotor impairments, claustrophobia, large body size, or irremovable ferromagnetic metal implements or dental appliances. Participants were recruited via fliers and online advertisements. Youth participated with one parent/caregiver present and all participants were compensated for their time. Caregivers completed questionnaires including demographics, youth behaviors, and family environment. Youth completed the sign-and goal-tracking task at the beginning of each session followed by self-report questionnaires, behavioral and neurocognitive tasks, practice tasks for imaging sessions, and neuroimaging. All study procedures were approved by and performed in accordance with relevant guidelines and regulations of the Western Institutional Review Board. Informed consent and written permission for youth participation (including consent for publication of images/videos) were obtained from caregivers and informed assent was obtained from youth. An a priori power analysis using G*Power software 58 was used to estimate the appropriate sample size based on the index of Pavlovian conditioned behavior found in an existing study of sign-tracking and goal-tracking in humans 27 . We applied power estimation procedures based on these www.nature.com/scientificreports/ values, a minimum effect size of 0.91, and assumed 2-tailed alpha of 0.05. This analysis indicated that a sample of 40 would be sufficient to detect an effect with 80% power. While under-powered for more complex analyses involving additional covariates, these results will be critical for estimating effect sizes with the caveat that effects reported here are likely overestimated. The lower bound on confidence intervals reported here may be appropriate, conservative estimates to use when planning future research. In addition to validation of the sign-tracker/goal-tracker paradigm, the purpose of this study was to identify behaviors and neurological profiles associated with sign-tracking phenotypes in youth. The age range for this sample was selected for multiple reasons. First, the average age of onset for externalizing disorders is 11 years old and many externalizing symptoms emerge during this age range 2 . Second, symptoms of a range of psychiatric conditions (e.g., depression, anxiety, substance use) typically begin to emerge later in adolescence 59 , making pre-adolescence a prime target for mental health screening and development of preventative interventions. Further, the timing of prefrontal cortex development, responsible for decision-making and higher-order executive functions 31 , pre-adolescent youth may be more likely to engage in behaviors influenced by enhanced motivational drive 60,61 than adults, likely impacting the detection of sign-tracking or goal-tracking phenotypes in humans. And finally, we used this age range in an attempt to maximize the quality of neuroimaging data collection in youth.
Sign-and goal-tracking apparatus and paradigm development. The Pavlovian conditioning apparatus used is described in Joyner et al. (2018) and was built to mimic the animal model as closely as possible. The apparatus consisted of two solid-colored response boxes, built to look like building blocks and be appealing to youth but not inherently rewarding. The boxes were: (1) the CS box containing a lever, which illuminated and extended from the box, and (2) the US box containing a small metal tray into which the reward was dispensed. In the CS box, a linear actuator, consisting of a 12 V DC motor (Bühler Motor, GmbH, Nuremberg, Germany) and a worm drive, was used to extend and retract the lever. In the US box, a reward dispenser with infrared sentry (Med Associates, Inc, Fairfax, VT) was used to release the reward. A touch sensor, based on a field-effect transistor, was used to detect a participant's touches to the metal reward tray. The hardware system was operated via an Arduino UNO microcontroller (Arduino, LLC, Somerville, MA, www. ardui no. cc). The microcontroller was programmed to provide three output signals, controlling the lever movement, the lever light, and the reward dispenser. Two Arduino inputs were programmed to record the lever presses and the reward tray touches. The experimental protocol was implemented in custom software written in MATLAB 62 (MathWorks, Inc, Natick, MA, www. mathw orks. com). The code was run by a researcher on a laptop communicating with the Arduino via a USB connection. Response times (in ms) corresponding to all lever presses and reward tray touches in each trial were saved to a file. The CS and US boxes were spaced approximately 12 inches apart to reduce the likelihood of participants simultaneously engaging with both. The left/right positioning of the boxes was counterbalanced between participants to minimize lateral bias.
Response boxes were covered prior to each session and uncovered simultaneously for a demonstration that was given in counterbalanced order at the beginning of each session. To demonstrate the lever box, researchers guided participant attention to the box and manually commanded the program to extend and retract the lever. To demonstrate the reward box, researchers manually commanded the program to dispense one token. Participants were instructed that each token was worth $0.20 and that tokens were theirs to keep and exchange later for money or a prize. During demonstration, participants were encouraged to touch each box at least once and were instructed that during the session, they could interact with either box in any way and as much as they liked. Participants were then instructed to move to the neutrally marked location in between both boxes to begin the session.
The structure of the Pavlovian conditioning paradigm was primarily modeled after rodent studies described by Flagel et al. 32 . This paradigm consists of multiple response-independent trials during which a lever (CS) illuminates and extends, and upon retraction, a reward portion (US) is dispensed. A randomly selected inter-trial interval (ITI) follows each trial, after which the next trial begins. Each of these actions occurs for every trial regardless of subject input or response. In typical animal models 32 , conditioning sessions consist of 25 trials lasting 30-45 min and occur across multiple days. Modeled after Joyner et al. 30 , we condensed training into a single session to make this more feasible for our population and avoid participant burden (e.g., multiple trips to the lab). Within this session, we conducted four blocks of ten trials each, with the total session lasting approximately 20-30 min. Each trial consisted of the lever illuminating and extending for 8.3 s, and upon retraction, the token reward (US) was dispensed into the tray. The ITI period was programmed to last for a randomly selected time, either 8, 16, 24, or 32 s. Each block was followed by a "wiggle break" lasting up to 45 s, during which the child was given the opportunity to relax and reset before the next trial. This setup was intended to maximize participant attention and minimize fatigue, while retaining enough length in the session for associative learning to occur. In order to capture the number of and latency to contacts to each stimulus, we measured these behaviors in line with the traditional measurement of rodent sign-and goal-tracking. The MATLAB 62 program controlling the apparatus recorded the number and timing of contacts to the CS lever and US reward tray during CS presentation and ITI (Fig. 1).
We measured sign-tracking behaviors via a Pavlovian Conditioned Approach (PavCA) index based on previously used models in animal studies 33 derived from the number and timing of physical contacts to the CS and US during CS-presentation and ITI phases. This index was calculated for each trial and consists of the average of three measures: response bias, probability difference score, and latency difference score. Response bias is the probability of contacting the CS versus the US and calculated as: (lever contacts − reward contacts)/(lever contacts + reward contacts). Probability difference score is the probability of contacting the CS minus the probability of contacting the US and calculated as: (lever contacts/(lever contacts + reward contacts)) − (reward contacts/(lever contacts + reward contacts)). Finally, latency difference score is the latency to contact the US minus www.nature.com/scientificreports/ latency to contact the CS for each trial and calculated as: (latency to reward contacts in ms * 0.001 − latency to lever contacts in ms * 0.001)/8.3 s. To estimate the reliability of the PavCA measure, we used a two-way intraclass correlation between even and odd trials for each participant. The reliability was excellent (ICC = 0.97, F(39,39.7) = 56.7, p < 0.001, CI = 0.94 to 0.98).
To assess learning of conditioned responses to the CS, we examined behavioral responses to the CS using twosided linear mixed effects models with the factors time (Block 1-4), phase (CS, ITI), and phenotype (ST, non-ST) for lever-and reward-directed behaviors (Supplementary Fig. S2) and PavCA (Fig. 1c). These models included age and sex as covariates and a random intercept for subject. For post hoc tests we used estimated marginal means. We investigated behaviors both during the CS periods (as seen in animal models) and ITI periods because, unlike animal models, our sample displayed lever-directed behaviors, albeit low levels, during the ITI-period (i.e., when the lever was retracted; CS lever contacts, m = 4.06, sd = 5.73; ITI lever contacts, m = 0.59, sd = 1.53). This may indicate an effect of investigative or exploratory behaviors common during this developmental stage 63 or limited attentional capacity during the longer ITIs and will be an important methodological consideration in human translation of this paradigm in future studies. Further, because human subjects attempted to interact with the lever even during the ITI (i.e., when it was retracted), and because CS lever contacts correlated positively with ITI lever contacts (r = 0.59, p < 0.001; Supplementary Table S2), a normalized frequency score was calculated for lever-and reward-directed behaviors (contacts and probability) in order to adequately compare between CS and ITI periods. To normalize, we divided each measure by the length of their respective phase. A non-normalized score was used for PavCA scores in the main text.
When determining an appropriate reward to use as the US, we took several factors into consideration. First, pilot testing primary food rewards in this age group (chocolate candies, as in Joyner et al. 30 ) demonstrated very minimal incentive toward the reward. Therefore, in order to choose an item that would be (1) rewarding to the study population and (2) small and consistent in shape to both adequately dispense from the machine and reflect the physical characteristics of the candies used in Joyner et al. 30 , we ultimately decided to use colorful wooden beads. Participants were informed that these tokens were worth $0.20 cents each and could be exchanged for their choice of prizes or money (totaling $8) at the end of the session. The use of a secondary reinforcement rather than primary is a notable deviation from rodent models and Joyner and colleagues 30 , however, we believe the benefit of increased participant engagement and reward motivation made this justifiable.
Measures. Parent-reported psychopathology. Two parent-report measures that have been developmentally validated for 9-12-year-olds were used to identify youth psychopathology symptoms. First, the Early Adolescent Temperament Questionnaire (EATQ-R) 35 is a 62-item parent-report measure of youth temperament and selfregulation in 9-15-year-olds. Subscales include effortful control (measures of attention, inhibitory control, and activation control), surgency (measures of surgency, fear, and shyness), negative affect (measures of aggression, frustration, and depressive mood), and affiliativeness. Second, the Child Behavior Checklist (CBCL) 34 measures dimensional psychopathology and is normed by age, sex, and ethnicity. It is a parent-reported youth behavioral questionnaire that includes eight empirically-based subscales for anxious/depressed, withdrawn/depressed, somatic complaints, social problems, thought problems, attention problems, rule breaking, and aggression as well as DSM-oriented subscales for attention deficit-hyperactivity, affective, anxiety, somatic, conduct, and oppositional defiant problems.
Family demographics and functioning. Parents responded to demographic questions from the PhenX survey toolkit 64,65 including household income and 7 items addressing economic adversity and basic needs unaffordability. Additional environmental influences included in our analyses were markers of family environment including the Protective and Compensatory Experiences Scale (PACEs) 50 , a 10-item parent-report scale measuring factors contributing to resiliency in childhood including relationships and resources available to youth. Finally, parents reported on the family environment with the 90-item Family Environment Scale 66,67 . Items were true/false and subscales for cohesion, organization, recreational activities, and conflict were included.
Youth self-report and neurocognitive functioning. Youth reported impulsive and inhibitory behaviors using the Urgency, Premeditation, Perseverance, Sensation Seeking, Positive Urgency (UPPS-P), Impulsive Behavior Scale, and the behavioral inhibition system/ behavioral approach system (BIS/BAS) 68 scale. A modified version of the UPPS-P from PhenX for children was used 64,69 consisting of 20 self-report questions addressing youth impulsive behaviors including subscales for negative and positive urgency, lack of premeditation and perseverance, and sensation seeking. An abridged version of the BIS/BAS scale was used including 24 items with subscales for drive, fun seeking, reward responsiveness, and inhibition 64,68 . Youth neurocognitive functioning was measured using the National Institutes for Health Toolbox Neurocognitive Battery specified for ages 7-17 70,71 . All Toolbox tests were administered measuring executive functioning, episodic memory, language, processing speed, working memory, and attention. For analysis purposes we used the age-corrected composite scores of fluid and crystalized cognitive functioning.
Neuroimaging. Head motion prevention. To prevent and minimize motion: (1) participants watched an age-appropriate informational video explaining MRI safety and the importance of staying still; (2) prior to the scan session, participants completed motion compliance training in mock scanners using head motion detection/feedback; (3) the head was stabilized in head coils; and (4) the MR technologist modeled relaxation techniques. Youth vision was screened and MRI-safe corrective lenses were provided if necessary.  44 . Each trial began with a cue presented for 2 s indicating the trial type, with a pink circle indicating a potential gain of $5 or $0.20, a blue triangle indicating no gain or loss, and a yellow square indicating a potential loss of $5 or $0.20 so that there were 5 trial types: high-loss, low-loss, neutral, low-win, and high-win. A 1.5 to 4 s fixation followed the cue, and then a black target was presented. Participants were instructed to press a button while the target was on the screen, with target duration varying between 0.15 and 0.5 s. On gain trials, participants were rewarded for successfully hitting the target and would neither gain nor lose money for missing it. On loss trials, participants lost the indicated amount when they missed and neither gained nor lost when they hit the target. The outcome of each trial was presented immediately after each response and lasted 2 s minus the duration of the target. Target duration was initialized based on performance during a practice session completed outside the scanner and updated during scanning so that participants would succeed on approximately 60% of trials, leading to mean earnings of $20 with a maximum of $60. Each run contained 50 trials (10 of each trial type) and lasted 5 min 42 s (Supplementary Fig. S3). BOLD imaging during the MID task took place on two identical GE MR750 3 T scanners using multiband acquisition with an acceleration factor of 6 and the following parameters: 60 axial slices, TR/TE = 800/30 ms, FOV/slice = 216/2.4 mm, 90 × 90 matrix producing 2.4 mm isotropic voxels, 419 volumes for 5 min 35 s of scan time per run. Additionally, high-resolution structural images were obtained through a 3D sagittal T1-weighted magnetization-prepared rapid acquisition with gradient echo sequence (TR/TE = 6/2.92 ms, FOV/ slice = 256 × 256/1 mm, 208 sagittal slices).
fMRI data preprocessing. fMRI data were preprocessed using the Analysis of Functional Neuroimaging (AFNI, http:// anfi. nimh. nih. gov) software 72 . Steps included removal of the first 10 volumes to allow for signal stabilization, despiking, slice timing correction, co-registration with the anatomical volume, motion correction, nonlinear warp to MNI space with resampling to 2 mm isotropic voxels, scaling to percent signal change, and application of a 4 mm Gaussian FWHM smoothing kernel. A general linear model was applied with regressors for the six motion parameters, three polynomial terms, and 2-s block regressors for each of the five trial types. Censoring was applied at the regression step so that any TRs with the Euclidean norm of the six motion parameter derivatives greater than 0.3 were removed, along with TRs where greater than 10 percent of brain voxels were outliers (estimated with 3dToutcount). The estimated beta coefficients from this single subject analysis were taken to the group level and are interpreted in terms of percent signal change for each condition.
Statistical analysis. Analyses were conducted in the R System for Statistical Computing 45 . Feasibility of the sign-and goal-tracking task was assessed in multiple ways. First, frequency counts were used to identify qualitative and quantitative levels of child engagement in the paradigm and how well the participants tolerated the task. Second, we assessed behavioral responses during CS and ITI periods using means and standard deviations of each behavior by block. Additionally, Pearson r correlations using the psych package in R 73 were used within behaviors to verify measurement of lever-or reward-directed behaviors. Blocks 1 and 2 were identified as a training phase, therefore the behaviors from Blocks 3 and 4 were averaged for each PavCA index score. To classify categorical phenotypic groups, we used a PavCA value of 0.5 or greater to define sign-tracking and less then −0.5 to define goal-tracking, consistent with categorization used in animal models 74 . Behavioral differences by phenotype were assessed using two-sided Welch two-sample t-tests in the R package stats 45 . Finally, we assessed the demonstration of learning a conditioned response to the CS by examining behavioral responses to the CS over all four Blocks (40 trials total). We used linear mixed effects models using the R package lme4 75 to test for three-way interactions between time (Block 1-4), phase (CS, ITI), and phenotype (ST, non-ST) for lever-and reward-directed behaviors. The R package emmeans 76 was used to assess planned post hoc comparisons. Error bars in figures represent standard error of the mean calculated in the Rmisc 77 package in R.
Recent research using animal models has shown that ITI duration may also impact the likelihood of displaying each CR and sign-tracking behavior appears to be more likely during a longer ITI periods 78 due to a weakened association between contacting the location of the US and receiving a reward. Additionally, the responses during ITI periods in our sample were relatively high compared to typical animal behaviors, therefore, to adequately compare between CS and ITI periods, scores were normalized by dividing each score by the length of each respective phase (8 s for CS, and 8, 16, 24, or 32 s for each ITI).
Behavioral outcome measures. We conducted Welsh independent samples t-tests using the R package stats to examine phenotypic differences in symptoms and environmental variables. Tests were p-value corrected for multiple comparisons using the FDR corrections. All variables were tested for normality and transformed using the optLog 79 package in R where necessary. CBCL subscales were all log transformed. Whole brain analyses. We performed a whole-brain voxelwise linear mixed effects analysis using AFNI's 3dLME 80 with fixed effects for group, condition, age, and sex, a group by condition interaction, and a random intercept for subject. We followed this with planned contrasts investigating a group (ST, non-ST) by condition (high-win vs neutral or high-loss vs neutral) interaction. 3dFWHMx (with the newer -acf option) was used to estimate the smoothness of the residuals of the group model, and this smoothness was used along with 3dClust-Sim to perform cluster-wise correction. The ACF parameters were estimated to be 0.09, 6.63, and 1.36 which indicates an effective FWHM smoothness of 2.8 mm. Significant clusters are reported using a voxelwise p-value threshold of 0.005 and α < 0.05 at the cluster level (N = 11.73 voxels). www.nature.com/scientificreports/ with data processing and technical assistance, and E. Pribil for data collection. The funders had no role in the study design, data collection or analysis, decision to publish, or preparation of the manuscript.